
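OpenClaw Enterprise Deployment: A Complete Plan

Published: 2026-03-18
Category: Enterprise Solutions
Tags: OpenClaw, enterprise, deployment, high availability, security, operations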


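Scenario: Enterprise Deployment for a 100-Person Team

Requirements

  • Users: 100 people, about 60 active per day
  • Concurrency: 20 concurrent sessions at peak
  • Availability: 99.9% (at most about 43 minutes of downtime per month)
  • Security: data encryption, access control, audit logging
  • Scalability: room to grow to 500 users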
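
The availability target can be turned into a concrete downtime budget. The sketch below assumes a 30-day month by default; the function name `downtime_budget_minutes` is illustrative and not part of OpenClaw.

```bash
#!/bin/bash
# Downtime budget implied by an availability target (illustrative helper).
# Usage: downtime_budget_minutes <availability_percent> [days_in_period]
downtime_budget_minutes() {
    awk -v pct="$1" -v days="${2:-30}" \
        'BEGIN { printf "%.1f\n", days * 24 * 60 * (100 - pct) / 100 }'
}

downtime_budget_minutes 99.9       # 30-day month: prints 43.2
downtime_budget_minutes 99.9 365   # full year:    prints 525.6
```

A 99.9% target therefore leaves roughly 43 minutes per month for all outages combined, planned or not.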

Challenges

  1. Single points of failure
  2. Performance bottlenecks
  3. Data security
  4. Operational complexity
  5. Cost control

Architecture Design

Overall Architecture

                    ┌─────────────────┐
                    │  Load balancer  │
                    │   (Nginx HA)    │
                    └────────┬────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
     ┌────────▼────────┐     │     ┌────────▼────────┐
     │ OpenClaw node 1 │     │     │ OpenClaw node 2 │
     │    (primary)    │     │     │    (standby)    │
     └────────┬────────┘     │     └────────┬────────┘
              │              │              │
              └──────────────┼──────────────┘
                             │
                   ┌─────────▼─────────┐
                   │   Redis cluster   │
                   │  (session store)  │
                   └─────────┬─────────┘
                             │
                   ┌─────────▼─────────┐
                   │    PostgreSQL     │
                   │   (persistence)   │
                   └─────────┬─────────┘
                             │
                   ┌─────────▼─────────┐
                   │  Object storage   │
                   │   (files/logs)    │
                   └───────────────────┘

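Technology Choices

| Component | Choice | Notes |
|---|---|---|
| Load balancing | Nginx + Keepalived | High availability |
| Application servers | OpenClaw × 2 | Primary/standby mode |
| Session store | Redis Cluster | Distributed cache |
| Database | PostgreSQL 15 | Data persistence |
| Object storage | MinIO | Self-hosted, S3-compatible |
| Monitoring | Prometheus + Grafana | Metrics monitoring |
| Logging | ELK Stack | Log analysis |
| Deployment | Docker + K8s | Containerized |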

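Deployment Steps

Step 1: Server Planning

| Server | Spec (vCPU / RAM) | Purpose | Count |
|---|---|---|---|
| app-server | 4C8G | OpenClaw application | 2 |
| db-server | 8C16G | PostgreSQL | 2 (primary/replica) |
| cache-server | 4C8G | Redis | 3 (cluster) |
| storage-server | 8C32G | MinIO | 3 (distributed) |
| lb-server | 2C4G | Nginx | 2 (primary/standby) |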
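
To see what the server plan adds up to, the sketch below sums node count, vCPU, and memory. The figures are copied from the plan above; the input format and the name `plan_totals` are illustrative choices.

```bash
#!/bin/bash
# Sum nodes, vCPU, and RAM across a server plan read from stdin.
# Input columns: role  vcpu_per_node  mem_gb_per_node  node_count
plan_totals() {
    awk '{ nodes += $4; cpu += $2 * $4; mem += $3 * $4 }
         END { printf "%d nodes, %d vCPU, %d GB RAM\n", nodes, cpu, mem }'
}

plan_totals <<'EOF'
app-server      4  8  2
db-server       8 16  2
cache-server    4  8  3
storage-server  8 32  3
lb-server       2  4  2
EOF
# prints: 12 nodes, 64 vCPU, 176 GB RAM
```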

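Recommended Cloud Providers

  • Alibaba Cloud: budget-friendly
  • Tencent Cloud: good value for money
  • AWS: global deployment

Step 2: Containerizing with Docker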

dockerfile
# Dockerfile
FROM node:22-alpine

WORKDIR /app

# Install OpenClaw
RUN npm install -g openclaw

# Copy configuration
COPY openclaw.json /root/.openclaw/
COPY workspace/ /root/.openclaw/workspace/

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD openclaw gateway status || exit 1

# Start the gateway in the foreground
CMD ["openclaw", "gateway", "start", "--foreground"]
yaml
# docker-compose.yml
version: '3.8'

services:
  openclaw-primary:
    build: .
    container_name: openclaw-primary
    environment:
      - NODE_ENV=production
      - OPENCLAW_CONFIG=/root/.openclaw/openclaw.json
    volumes:
      - ./config:/root/.openclaw
      - ./workspace:/root/.openclaw/workspace
      - ./logs:/root/.openclaw/logs
    networks:
      - openclaw-net
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
    healthcheck:
      test: ["CMD", "openclaw", "gateway", "status"]
      interval: 30s
      timeout: 10s
      retries: 3

  openclaw-replica:
    build: .
    container_name: openclaw-replica
    environment:
      - NODE_ENV=production
    volumes:
      - ./config:/root/.openclaw
      - ./workspace:/root/.openclaw/workspace
      - ./logs:/root/.openclaw/logs
    networks:
      - openclaw-net
    depends_on:
      - openclaw-primary
    deploy:
      replicas: 1

  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data
    networks:
      - openclaw-net

  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=openclaw
      - POSTGRES_USER=openclaw
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - openclaw-net

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
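      # Mounted under conf.d: this nginx.conf holds only upstream/server blocks,
      # which must be included inside the stock http {} context
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro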
      - ./ssl:/etc/nginx/ssl
    networks:
      - openclaw-net
    depends_on:
      - openclaw-primary
      - openclaw-replica

volumes:
  redis-data:
  postgres-data:

networks:
  openclaw-net:
    driver: bridge

Step 3: Nginx Load Balancing Configuration

nginx
# nginx.conf
upstream openclaw_backend {
    least_conn;
    server openclaw-primary:8080 weight=3;
    server openclaw-replica:8080 weight=2;
    keepalive 32;
}

server {
    listen 80;
    server_name openclaw.company.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name openclaw.company.com;

    ssl_certificate /etc/nginx/ssl/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Strict-Transport-Security "max-age=31536000" always;

    location / {
        proxy_pass http://openclaw_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;

        # Buffering
        proxy_buffering off;
    }

    # Health check endpoint (answered by Nginx itself; it does not probe the OpenClaw nodes)
    location /health {
        access_log off;
        default_type text/plain;
        return 200 "healthy\n";
    }
}

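Step 4: High Availability Configuration

Keepalived Configuration

bash
# /etc/keepalived/keepalived.conf (master node)
vrrp_script check_nginx {
    script "killall -0 nginx"
    interval 2
    # Negative weight: when the check fails, priority drops by 20 (100 -> 80),
    # below the backup's 90, so the virtual IP moves to the backup node
    weight -20
}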

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass 1111
    }

    virtual_ipaddress {
        192.168.1.100
    }

    track_script {
        check_nginx
    }
}
bash
# /etc/keepalived/keepalived.conf (backup node)
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass 1111
    }

    virtual_ipaddress {
        192.168.1.100
    }
}
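
Whether the virtual IP actually moves when Nginx dies depends on how the `track_script` weight changes the master's priority. In Keepalived, a positive weight is added while the check succeeds, and a negative weight is applied while it fails. The sketch below models only that arithmetic; the function names are illustrative, and ties (which Keepalived breaks by IP address) are not modeled.

```bash
#!/bin/bash
# effective_priority <base_priority> <script_weight> <check_ok: 1|0>
effective_priority() {
    base=$1; weight=$2; ok=$3
    if [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $((base + weight))      # positive weight: bonus while healthy
    elif [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $((base + weight))      # negative weight: penalty while failing
    else
        echo "$base"
    fi
}

# vip_holder <master_priority> <backup_priority>
vip_holder() {
    if [ "$1" -gt "$2" ]; then echo master; else echo backup; fi
}

# Nginx is down on the master; the backup stays at priority 90.
vip_holder "$(effective_priority 100 2 0)" 90     # 100 vs 90: prints master
vip_holder "$(effective_priority 100 -20 0)" 90   #  80 vs 90: prints backup
```

With bases of 100 and 90, a failing check has to cost the master more than 10 points, otherwise the virtual IP stays on the node whose Nginx is down.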

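Step 5: Data Backup Strategy

bash
#!/bin/bash
# backup.sh - automated backup script
set -euo pipefail

BACKUP_DIR="/backups/openclaw"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

mkdir -p "$BACKUP_DIR"

# Back up the database (pg_dump reads the password from ~/.pgpass or PGPASSWORD)
pg_dump -h localhost -U openclaw openclaw | gzip > "$BACKUP_DIR/db_$DATE.sql.gz"

# Back up configuration files
tar -czf "$BACKUP_DIR/config_$DATE.tar.gz" \
    /root/.openclaw/openclaw.json \
    /root/.openclaw/workspace/

# Back up credentials (encrypted). gpg --symmetric accepts a single input,
# so the JSON files are bundled with tar first.
tar -czf - /root/.openclaw/credentials/*.json | \
    gpg --symmetric --cipher-algo AES256 \
        --batch --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
        --output "$BACKUP_DIR/credentials_$DATE.tar.gz.gpg"

# Upload only this run's files to object storage
aws s3 cp "$BACKUP_DIR" "s3://openclaw-backups/$DATE/" \
    --recursive --exclude "*" --include "*_$DATE.*" \
    --endpoint-url http://minio:9000

# Remove local backups older than the retention window
find "$BACKUP_DIR" \( -name "*.gz" -o -name "*.gpg" \) -mtime +"$RETENTION_DAYS" -delete

echo "✅ Backup complete: $DATE"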
json
{
  "name": "Daily backup",
  "schedule": {
    "kind": "cron",
    "expr": "0 2 * * *",
    "tz": "Asia/Shanghai"
  },
  "payload": {
    "kind": "systemEvent",
    "text": "Run /opt/openclaw/backup.sh"
  }
}
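
A scheduled backup is only useful if the resulting archive can be read back. As a minimal sketch, the helper below finds the newest database dump by its timestamped file name and checks that it is a valid gzip stream. The name `verify_latest_backup` and its return codes are assumptions made for illustration; it does not replace a real restore test.

```bash
#!/bin/bash
# verify_latest_backup <backup_dir>
# Returns 0 and prints the file name if the newest db_*.sql.gz passes gzip -t,
# 1 if no dump exists, 2 if the newest dump is corrupt.
verify_latest_backup() {
    dir=$1
    latest=""
    # Globs expand in sorted order; db_YYYYMMDD_HHMMSS names sort chronologically
    for f in "$dir"/db_*.sql.gz; do
        if [ -e "$f" ]; then latest=$f; fi
    done
    if [ -z "$latest" ]; then
        echo "no database dump found in $dir" >&2
        return 1
    fi
    if ! gzip -t "$latest" 2>/dev/null; then
        echo "corrupt archive: $latest" >&2
        return 2
    fi
    echo "ok: $latest"
}

# Example: verify_latest_backup /backups/openclaw || echo "backup check failed"
```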

Monitoring and Alerting

Prometheus Configuration

yaml
# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'openclaw'
    static_configs:
      - targets: ['openclaw-primary:8080', 'openclaw-replica:8080']
    metrics_path: '/metrics'

  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx:9113']

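  # Redis does not serve Prometheus metrics on 6379; scrape a redis_exporter
  # instead (the hostname is an assumption, 9121 is the exporter's default port)
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']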

  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres:9187']

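Grafana Dashboards

Key metrics:

  1. Application layer

    • Request rate (QPS)
    • Response time (P95, P99)
    • Error rate
    • Active sessions
  2. System layer

    • CPU usage
    • Memory usage
    • Disk usage
    • Network traffic
  3. Business layer

    • Daily active users
    • API call volume
    • Token consumption
    • Cost tracking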

Alert Rules

yaml
# alert_rules.yml
groups:
  - name: openclaw_alerts
    rules:
      - alert: HighErrorRate
        # Ratio of 5xx responses to all responses, so the 0.05 threshold means 5%
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate"
          description: "Error rate is above 5%"

      - alert: HighResponseTime
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Slow responses"
          description: "P95 response time is above 2 seconds"

      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage"
          description: "Memory usage is above 90%"

      - alert: InstanceDown
        expr: up{job="openclaw"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance down"
          description: "{{ $labels.instance }} is down"
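
The HighErrorRate description speaks of a 5% error rate, which is a ratio of failed requests to all requests rather than a count of failures per second. The sketch below illustrates the difference with made-up request counts; the function names are illustrative.

```bash
#!/bin/bash
# error_ratio <failed_requests> <total_requests> -> fraction with 4 decimals
error_ratio() {
    awk -v e="$1" -v t="$2" \
        'BEGIN { if (t == 0) print "0.0000"; else printf "%.4f\n", e / t }'
}

# alert_state <ratio> <threshold> -> "firing" or "ok"
alert_state() {
    awk -v r="$1" -v th="$2" \
        'BEGIN { if (r > th) print "firing"; else print "ok" }'
}

# Quiet service: 30 failures out of 400 requests in the window (7.5%)
alert_state "$(error_ratio 30 400)" 0.05     # prints firing
# Busy service: the same 30 failures out of 6000 requests (0.5%)
alert_state "$(error_ratio 30 6000)" 0.05    # prints ok
```

Both cases have the same absolute number of failures, so a threshold on the raw failure rate would treat them identically, while the ratio separates them.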

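Security Hardening

1. Network Security

bash
# Firewall configuration
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp      # SSH
ufw allow 80/tcp      # HTTP (Nginx redirects it to HTTPS)
ufw allow 443/tcp     # HTTPS
ufw allow from 10.0.0.0/8 to any port 5432  # database access from the internal network
ufw allow from 10.0.0.0/8 to any port 6379  # Redis access from the internal network
ufw enable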

2. Access Control

json
{
  "security": {
    "authRequired": true,
    "allowedIPs": ["10.0.0.0/8", "192.168.0.0/16"],
    "rateLimit": {
      "requests": 100,
      "windowMs": 60000
    },
    "sessionTimeout": 3600000
  }
}
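
The `allowedIPs` entries are CIDR ranges. How OpenClaw evaluates them is not specified here, so the sketch below only illustrates what membership in such a range means for an IPv4 address. All function names are illustrative, and input validation is omitted.

```bash
#!/bin/bash
# ip_to_int <a.b.c.d> -> the address as a 32-bit integer
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                      # split the dotted quad into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr <ip> <network/prefix_len> -> exit status 0 if the IP is in range
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    size=$(( 1 << (32 - bits) ))   # number of addresses in the range
    [ $(( ip / size )) -eq $(( net / size )) ]
}

# is_allowed <ip> <cidr>... -> prints "allowed" or "denied"
is_allowed() {
    addr=$1; shift
    for cidr in "$@"; do
        if ip_in_cidr "$addr" "$cidr"; then
            echo allowed
            return 0
        fi
    done
    echo denied
    return 1
}

is_allowed 10.1.2.3   10.0.0.0/8 192.168.0.0/16           # prints allowed
is_allowed 172.16.0.1 10.0.0.0/8 192.168.0.0/16 || true   # prints denied
```

Note that 172.16.0.0/12, the third private IPv4 range, is not covered by the two entries in the configuration above.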

3. Audit Logging

javascript
// Audit log configuration
{
  "logging": {
    "level": "info",
    "audit": {
      "enabled": true,
      "events": [
        "login",
        "logout",
        "config_change",
        "credential_access",
        "data_export"
      ],
      "retention": 90
    }
  }
}

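Cost Estimate

Monthly Cost (100-Person Team)

| Item | Configuration | Cost |
|---|---|---|
| Application servers | 4C8G × 2 | 800 yuan |
| Database | 8C16G × 2 | 1,600 yuan |
| Cache | 4C8G × 3 | 1,200 yuan |
| Storage | 8C32G × 3 | 2,400 yuan |
| Load balancing | 2C4G × 2 | 400 yuan |
| LLM API | 100,000 calls | 3,000 yuan |
| Bandwidth | 100 Mbps | 500 yuan |
| Total | | 9,900 yuan/month |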

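Per-User Cost

  • 9,900 yuan ÷ 100 people = 99 yuan per person per month

ROI Analysis

Assuming each person saves 1 hour per working day, at an hourly rate of 50 yuan:

  • Monthly savings: 100 people × 22 days × 1 hour × 50 yuan = 110,000 yuan
  • Monthly cost: 9,900 yuan
  • ROI: (110,000 - 9,900) ÷ 9,900 ≈ 1011%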
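
The cost and ROI figures above can be reproduced with a few lines of arithmetic, which also makes it easy to rerun the estimate with different assumptions (team size, hours saved, hourly rate). The helper name `roi_percent` is illustrative.

```bash
#!/bin/bash
# roi_percent <monthly_saving> <monthly_cost> -> ROI as a whole-number percent
roi_percent() {
    awk -v s="$1" -v c="$2" 'BEGIN { printf "%.0f\n", (s - c) / c * 100 }'
}

cost=$((800 + 1600 + 1200 + 2400 + 400 + 3000 + 500))   # line items, yuan/month
saving=$((100 * 22 * 1 * 50))                           # people × days × hours × rate

echo "total cost:   $cost yuan/month"                   # 9900
echo "per person:   $((cost / 100)) yuan/month"         # 99
echo "gross saving: $saving yuan/month"                 # 110000
echo "ROI:          $(roi_percent "$saving" "$cost")%"  # 1011%
```

The result is dominated by the assumed hour saved per person per day; halving that assumption roughly halves the ROI.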

Operations Runbook

Daily Checks

bash
#!/bin/bash
# daily_check.sh

echo "=== OpenClaw daily check ==="

# 1. Service status
echo "1. Service status:"
docker ps | grep openclaw

# 2. Resource usage
echo "2. Resource usage:"
top -bn1 | head -10

# 3. Disk space
echo "3. Disk space:"
df -h

# 4. Error log
echo "4. Recent errors:"
tail -50 /var/log/openclaw/error.log | grep ERROR

# 5. Backup status
echo "5. Backup status:"
ls -lt /backups/openclaw/ | head -5
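
The disk space step prints a table that someone has to read. As a sketch of how it could become a pass/fail check, the filter below lists mount points at or above a usage threshold. The name `usage_over` and the 85% figure are assumptions, and mount points containing spaces are not handled.

```bash
#!/bin/bash
# usage_over <threshold_percent>
# Reads `df -P` output on stdin and prints "<mount point> <usage>%" for every
# filesystem whose usage is at or above the threshold.
usage_over() {
    awk -v th="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= th) print $6 " " use "%"
    }'
}

# Example: df -P | usage_over 85
```

An empty result means every filesystem is below the threshold.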

Troubleshooting

Scenario 1: Primary Node Down

bash
# 1. Check status
kubectl get pods

# 2. View logs
kubectl logs openclaw-primary

# 3. Restart the service
kubectl rollout restart deployment/openclaw

# 4. Verify recovery
curl https://openclaw.company.com/health
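
A single request right after a restart can fail simply because the service is still starting. The generic retry helper below is a sketch of how the verification step could be repeated until it succeeds; `retry` is an illustrative name, not an OpenClaw or kubectl command.

```bash
#!/bin/bash
# retry <attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or the attempts are used up.
retry() {
    attempts=$1; delay=$2; shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: poll the health endpoint up to 10 times, 3 seconds apart
# retry 10 3 curl -fsS -o /dev/null https://openclaw.company.com/health
```

Since the `/health` location in the Nginx configuration returns a fixed response, a passing check confirms the load balancer is reachable, not that the OpenClaw nodes behind it are serving requests.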

Scenario 2: Database Connection Failure

bash
# 1. Check database status
kubectl get pods | grep postgres

# 2. Test the connection
psql -h localhost -U openclaw -d openclaw -c "SELECT 1"

# 3. View database logs
kubectl logs postgres-primary

# 4. Restart the database
kubectl rollout restart statefulset/postgres

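Summary

Key points for an enterprise deployment:

  1. High-availability architecture - load balancing + primary/standby nodes
  2. Data persistence - database + object storage
  3. Monitoring and alerting - real-time monitoring + automated alerts
  4. Security hardening - network isolation + access control + auditing
  5. Backup strategy - automated backups + off-site storage
  6. Operations runbook - standardized procedures

Best suited for: teams of 50 or more, production environments, business-critical workloads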


Related documentation:

  • [Security Best Practices](/guide/daily-openclaw-安全实践 -20260318)
  • [Performance Optimization Guide](/guide/daily-openclaw-性能优化 -20260318)
  • [Cost Optimization in Practice](/guide/daily-openclaw-成本优化实战 -20260318)

Released under the MIT License.